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ABSTRACT. In mixture experiments with noise variables or process variables that can not be 
controlled, investigate and try to control the variability of the response variable is very important 
for quality improvement in industrial processes. Thus, modeling the variability in mixture 
experiments with noise variables becomes necessary and has been considered in literature with 
approaches that require the choice of a quadratic loss function or by using the delta method. 
In this paper, we make use of the delta method and also propose an alternative approach, 
which is based on the Joint Modeling of Mean and Dispersion (JMMD). We consider a mixture 
experiment involving noise variables and we use the techniques of JMMD and of the delta method 
to get models for both mean and variance of the response variable. Following the Taguchi’s ideas 
about robust parameter design we build and solve an optimization problem for minimizing the 
variance while holding the mean on the target. At the end we provide a discussion about the 
two methodologies considered. 

Keywords: joint modeling of mean and dispersion, delta method, mixture experiments with 
noise variables, robust parameter design. 


1 Introduction 

Experiments with mixtures involve the mixing or blending of two or more ingredients 
to form an end product. For this type of experiment it is of interest to determine the 
proportions of the mixture components which lead to desirable results with respect to 
some quality characteristic of interest. 

In general, the quality of the end product is a function of the proportions of the in¬ 
gredients and of other factors that do not form any portion of the mixture, as heat or 
time; these factors are called process variables or process conditions and can not always 
be controlled. In other words, in some mixture experiments, the response depends not 


1 


only on the proportion of the mixture components present in the mixture but also on 
the processing conditions that are, in general, designated as process variables. Process 
variables are factors in an experiment that do not form any portion of the mixture but 
whose levels when changed could affect the blending properties of the ingredients. In 
the literature on mixture experiments with process variables, the goal is determining the 
proportions of the mixture components along with situations of process conditions, so 
that the response becomes as close as possible to a target value. 

Experiments with mixture and process variables are well covered in [3], but modeling of 
variance is not considered. The modeling of variance in mixture experiments with noise 
variables has been considered in HZ], who proposed a combined mixture-process-noise 
variable model, built and solved an optimization problem to minimize a quadratic loss 
function, taking into account both mean and variance of response. Another approach 
to modeling the variance in mixture experiments is due to [B] using the delta method, 
which employs a Taylor series approximation of the regression model at a vector of process 
variables. 

In this paper, besides the delta method, we also consider the joint modeling of mean 
and dispersion (JMMD) as an alternative approach to modeling the variance in mixture 
experiments with noise variables. The JMMD was introduced by m as an alternative 
to Taguchi’s methods in quality-improvement experiment and provides a methodology to 
find and check the fit of the models found with a solid statistical basis. Further examples 
of applications of the JMMD can be found in [7] and [5]. A a comprehensive review of 
the joint modeling of mean and dispersion proposed by m is given by m- 

We consider a mixture experiment involving noise variables and we use the approaches of 
JMMD and the delta method to get models for both mean and variance of the response 
variable. Following the Taguchi’s ideas about robust parameter design, see [15] we build 
and solve an optimization problem for minimizing the variance while holding the mean 
on the target. At the end we present some considerations about both methods used. 

The paper is organized as follows. In Section 2, we introduce mixture experiments and 
present some models used for mixtures with process variables. In Section 3, we describe 
briefly the joint modeling of mean and dispersion and discuss the principal points of 
the theory. In Section 4, we make a resume about the delta method. In Section 5, we 
apply the JMMD and the delta method to get models for both mean and variance in 
an example of mixture experiment involving noise variables. Additionally we build and 
solve an optimization problem for minimizing the variance while holding the mean on 
the target. Finally, in Section 6, we provide a discussion about the two methodologies 
considered. 
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2 Mixture experiments 


A mixture experiment involves mixing proportions of two or more components to make 
different compositions of an end product. Mixture component proportions Xi are subject 
to the constraints 


0 < Xi < 1 i = l,2,...,a and Xi = 1 (1) 

i=l 

where a is the number of components involved in the mixture experiment. Consequently, 
the design space is a (a — l)-dimensional simplex or part of a simplex if there are further 
conditions on the proportions such as U < Xi < Ui for i = l,2,...,a — 1, with the 
proportion Xa taking values which make up the mixture. 

If, in addition to the a mixture components x* = (xi,...,Xa), there are b process 
variables z* = {zi,..., Zb); we can consider typically additive models like ri{x,z) = 
C(a;) + '&{z) or complete cross product models of the type ri{x,z) = ((x)i9(z) or com¬ 
binations of these such as r]{x,z) = ((x) -|- v{x,z), where Ci^) represents the mixture 
model, i9(z) represents the process variable model and i/{x,z) comprises products of 
terms in s-nd 1 ^( 2 ). For three mixture components and two process variables, with 
C(a=) = ELi a®* + ELi Ej>i /3ijX^XJ + P 123 X 1 X 2 X 3 + ELi Ej>i IzjXiXjixi - Xj) and 
1 ^( 2 ;) = oo + ELi ai2* + ELi + <xi 2 ZiZ 2 , the combined multiplicative model, which 

includes the Scheffe cubic model for the mixture and the reduced quadratic model for 
the process variables, is given by 


3 2 3 2 3 

7i{x,z) = P°jXiXj -f /3 i232^1®2X3 + 

1—1 j>l 1—1 j>l 


i^l 


b) + E 


1^1 


2 3 


E/3'a^. + EE PljXiXj -I- (3[23XiX2X3 + 

i—1 i—1 j>i 


2 3 

EEbiia;iXj(xi -Xj) 

i—1 j>i 


Zl 


E 


2 3 


i—1 i—1 j>i 


2 3 


P123X1X2X3 + Y^Y.^-x.xM-x,) 

i—1 j>i 


Zl 


E 


2 3 2 3 

EE PljXiXj + l5\l^XiX2X3 + 

i=l j>i i=l j>i 


Z 1 Z 2 (2) 

In general, the methodology used to construct mixture designs involving process variables 
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is combination of two designs, one being a mixture design for the mixture components 
and the other being factorial or fractional factorial design for the process variables. For 
more details about mixture experiments with process variables see [3]. 


3 Joint modeling of mean and dispersion 

According to m. the method of joint modeling of mean and dispersion consists of finding 
joint models for the mean and dispersion. In their approach, using the extended quasi¬ 
likelihood, two interlinked generalized linear models (GLM) are needed, one for the mean 
and the other for the dispersion. For the dispersion model is used as response variable 
the deviance component 



(3) 


for each observation j/i, where fXi represents the mean for the Ah observation and V{.) is 
the variance function for the GLM, see m. p. 360. 

Let X* = {xi,..., Xa) be a vector with design factors and z*- = {zi,..., Zh) be a vector with 
noise factors. Suppose that = (ti,..., G) and m* = (ui,..., Ur) are the independent 
variables that affect the mean and the dispersion models respectively. The vectors t 
and u may contain factors occurring in a;, in 2 : or in both. The independent variables 
for the dispersion model are commonly, but not necessarily, a subset of the independent 
variables for the mean model. 

Gonsider Yi,..., n independent random variables with the same probability distri¬ 
bution, whose values, given by yi,...,y„, are the results of an experimental arrange¬ 
ment. Spite of the distribution of Yi is unknown, it is assumed that E{Yi\Zi) = and 
Var{Yi\Zi) = where Zi is a vector with the fixed values for the random noise 

variables, (pi is the dispersion parameter and V (.) is the variance function. Let be a link 
function for the mean model, i.e., 77 * = = f\U)P with f*{U) = {fi{ti),fpiU)) 

where fj{ti), for j = l,...,p, is a known function of ti and /3 is a p x 1 vector 
of unknown parameters. Following [7], for the dispersion model we are assuming a 
gamma model with a log link function, i.e., Ti = ijj{4>i) = ln((/>i) = g*(ui)j, with 
g*{ui) = {gi{ui),... ,gq{ui)), where gj{ui), for j = is a known function of 

Ui and 7 is a g X 1 vector of unknown parameters. Thus, on the dispersion model we are 
considering E{di) = pi and Var{di) = 2pf (see [TT], p. 361). Note that, in general, the 
factors occurring in /(t) and g{u) can be linear effects, quadratic effects or interactions 
between the factors occurring in x and in 2 . A term occurring in the mean linear predic¬ 
tor only can thus be used to get the mean close to target, while a term in the dispersion 
linear predictor, whether or not it occurs also in the mean, can be used to reduce the dis¬ 
persion. We also define T and U the experimental matrices for the mean and dispersion 
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models respectively, with T = [/(ti),..., f{tn)Y ^ = [^(“i)) ■ • ■ where n 

is the number of observations. 

The fitting for the JMMD uses as an optimizing criterion the idea of extended quasi¬ 
likelihood, introduced by m , see m,p- 349. In this work we use the adjusted extended 
quasi-likelihood, introduced by [7]. The adjusted extended quasi-likelihood is given, in 
our notation, by 


-2Q+ = 


2=1 


-I- ln{27r(;iiil/(yi)} 


(4) 


where d* = is the standardized deviance component and hi is the ith element of 
the diagonal of the matrix H = wiT{T*WT)~^TW^ , being W, the weight matrix 
for the GLM, a diagonal matrix with elements given by Wi = ’JITJJT)' 4"^® Table 

[T] gives a resume of the joint modeling of mean and dispersion. From Table [TJ we can 
observe that the standardized deviance component from the model for the mean becomes 
the response for the dispersion model, and the inverse of fitted values for the dispersion 
model provides the prior weights for the mean model. 


Table 1: Summary of the JMMD 


Component 

Mean model 

Dispersion model^ 

Response variable 

y* 

d* 

Mean 

/if 

4^i 

Variance 


2</>? 

Link function 

m = ‘fihi) 


Linear predictor 

= f{u)fd 

II 

Deviance component 



Prior weight 

1 

^ 

(1 - h,)/2 


fpor the dispersion model we are assuming a gamma model with logarithmic link function 


The algorithm for the JMMD is an extension of the standard GLM algorithm, in which 
the model for the mean is fitted assuming that the fitted values for the dispersion are 
known and that the model for dispersion is fitted using the fitted values for the mean. The 
fitting alternates between the mean and dispersion models until convergence is achieved. 

After complete convergence, that is, if the equation (IT6l) is satisfied for a small e, and after 
model checking the final joint model is found. The iterative process and the algorithm 
for the JMMD, adapted from are shown in Appendices A and B. 

Note that the models for mean and variance, conditional to random noise variables, are 
obtained a.s E{Y\Z) = fi = [ri{x, Z)] and Var{Y\Z) = = exp {^{x, Z)} V. 

We can find E{Y) and Var{Y) by using the expressions E(Y) = E{EiY\Z)) and 
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Var{Y) = Var{E(Y\Z)) + E{Var(Y\Z)). It is assumed that the distribution of Z 
is known or can be estimated. 


4 Delta method 

The delta method is a well-known technique for finding approximations, based on Taylor 
series expansions, to the mean and variance of functions of random variables. In this 
paper, the delta method will be applied, considering the second-order expansion of Taylor 
series around the mean of noise variables. Let Zi,..., Z^he random variables with means 
and define Z* = {Zi,..., Zh) and fE = {fii,..., fit)- For our problem of 
mixture experiments, we also consider x as the vector of mixture components. Suppose 
there is a twice differentiable function rf(x, Z) for which we want an approximate estimate 

d 

of mean and variance. Define rj' {x, /x) = {x, z) 

The second-order Taylor series expansion of 77 about fi is 

Tf{x, z) = r]{x, fi) + {z- fi)*rf\x, fi) + {z - fifUiz - /x) -t 7^, (5) 

where TZ is the remainder of the approximation and will be ignored. Thus, taking Y = 
ri{x, Z) we can obtain 

E{Y) = E [rf{x, Z)] = ifix, fi) + E [{Z - fifUiZ - /x)] (6) 

and 

Var{Y) = Var [rf{x, Z)] = Var [{Z - fifri'{x, fi) + [Z - fifUiZ - /x)] . (7) 

For a comprehensive treatment about the delta method see [2], p. 240. 


and H = 




dzdz* 


r]{x,z) 


5 Application to a bread-making problem 

The bread-making problem, originally presented by [S], according to m, consisted of 
an experiment with three ingredients of mixture and two noise variables, and had as 
objective to investigate and to value the final quality of flour, composed by different 
mixtures of wheat flour, for production of bread. [5], according to [12], considered three 
types of wheat flour: two Norwegian, Tjalve (ii) and Folke {X 2 ) and one American, 
Hard Red Spring (a; 3 ); that were considered as control variables, and two types of noise 
variables: mixing time (zi) and the proofing (resting) times of the dough (^ 2 ), considered 
as noise variables. The response variable was considered as the loaf volume after baking 
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with target value of 530 ml. The flour blends were considered to be mixing ingredients 
with xi + X 2 + X 3 = 1 and with constraints 0.25 < a;i < 1.0; 0 < X 2 < 0.75 and 
0 < ^3 < 0.75, where cci, X 2 and X 3 are the proportions of Tjalve, Folke and Hard Red 
Spring flour, respectively. For the noise variables, it was considered three situations for 
the mixing time: 5, 15 and 25 minutes and also three situations for proofing time: 35, 
47.5, and 60 minutes. 

A full 3^ factorial design was used for the noise variables and the 10 runs corresponding 
to a simplex lattice design were replicated at each of the nine combinations of the mixing 
and proofing times, so that the complete design involved 90 experimental runs as shown 
in Figure [H We consider the noise variables coded as zi = (mixing time — 15)/10 and 
Z 2 = (proofing time — 47.5)/12.5. 



Figure 1: The complete design in two noise variables, zi and 22 , and a simplex-lattice design in three mixture 
components xi, X 2 and x^. 


The volumes recorded for the 10 flour types and the 9 combinations of the noise variables 
are reproduced in Tabled using the run numbers, from 1 to 10, as identified in Figure[T] 
Additional details of the way in which the experiment was conducted are given by [12) . 
and further description of the practical aspects of the study is provided by [S] according 

to [ig. 


Table 2: Loaf volume for the 10 flour types and the 9 combinations of the noise variables 









INoise l^’actors 





Desig 

n factors 

zi -1 

0 

1 

-1 

0 

1 

-1 

0 

1 


XI 

^2 

®3 

22 -1 

-1 

-1 

0 

0 

0 

1 

1 

1 

1 

0.25 

0.75 

0.00 

378.89 

396.67 

392.22 

445.56 

452.22 

487.78 

457.22 

500.56 

472.78 

2 

0.50 

0.50 

0.00 

388.89 

423.33 

416.11 

460.00 

488.89 

475.78 

472.78 

478.00 

506.11 

3 

0.75 

0.25 

0.00 

426.11 

483.33 

389.44 

474.44 

514.44 

462.78 

506.67 

591.67 

522.22 

4 

1.00 

0.00 

0.00 

386.11 

459.11 

423.33 

458.33 

506.11 

514.44 

545.56 

522.22 

551.11 

5 

0.25 

0.50 

0.25 

417.78 

437.22 

444.56 

484.44 

490.00 

495.00 

497.78 

531.11 

577.78 

6 

0.50 

0.25 

0.25 

389.44 

447.22 

415.00 

490.89 

528.89 

507.78 

517.78 

567.22 

538.33 

7 

0.75 

0.00 

0.25 

448.33 

459.44 

455.56 

436.00 

535.00 

552.22 

507.44 

578.89 

590.00 

8 

0.25 

0.25 

0.50 

413.89 

485.56 

462.22 

483.89 

529.44 

540.00 

565.00 

598.89 

580.56 

9 

0.50 

0.00 

0.50 

415.56 

514.44 

437.78 

493.89 

583.33 

578.89 

524.44 

694.44 

640.00 

10 

0.25 

0.00 

0.75 

432.78 

498.33 

517.22 

474.44 

568.33 

579.44 

541.11 

638.89 

638.89 
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Naes et al. m carried out a study of the use of robust design methodology to the bread¬ 
making problem to investigate the underlying relationships between the response variable 
loaf volume and the mixture and noise variables, comparing three techniques for analysing 
the loaf volume, i.e., the mean square error, the analysis of variance and the regression 
approach, where all factors, the three mixtures components and the two noise variables, 
are modeled simultaneously. In the analysis carried out by them, the full crossed model 
® for three mixture ingredients and two noise variables was taken as the starting model, 
however a detailed argument was presented for reducing the number of parameters by 
removing some of the second and third-order mixture terms before performing the cross. 
So the initial reduced model had 28 terms. After Backwards elimination using regression 
methods, the final model with 18 terms was obtained. Naes et al. [12] obtain their results 
considering a homoscedastic Gaussian model for the response variable. 

In this paper, we reexamine the bread-making problem considering the possibility of 
obtaining, in addition to the mean model, a model for variance. In our analysis, we 
will consider the Taguchi’s ideas on Robust Parameter Design (RPD). According to [TOj . 
RPD is an approach to product realization activities that emphasizes choosing the levels 
of controllable factors (or parameters) in a process or product to achieve two objectives: 

i) to ensure that the mean of the output response is at a desired level or target and 

ii) to ensure that the variability around this target value is as small as possible. In a 
problem of robust parameter design we must obtain models for both mean and variance 
of the process or product. Thus, we can minimize the variability by finding the optimum 
settings of factors that affect the variance model and then adjusting the mean to its 
target value by finding appropriate settings of factors that affect the mean model. For 
our purpose, we will use two distinct methodologies to get the models for mean and 
variance: the joint modeling of mean and dispersion and the delta method. 


5.1 Modeling of mean and variance via JMMD 

For we apply the JMMD we need to choose a variance function and a link function 
for the mean model, besides the independent variables for the models of the mean and 
dispersion. Naes et al. [12] have considered a Gaussian model in their analysis; thus, 
initially we consider the model for the mean like Gaussian with identity link function and 
variance function R(/i) = I. For the dispersion model, as mentioned before in Section jS] 
it is assumed a gamma model with logarithmic link function. These assumptions will be 
checked in our analysis. 


5.1.1 Model building strategy 


We start our analysis with the same initial reduced model, involving 28 terms, considered 
by [H] and given in equation ([5]). 


r]{x,z) = + f 32 X 2 + P 3 X 3 + ^° 2 XiX 2 + ^ 13 X 1 X 3 + ^° 2 XiX 2 {xi - X 2 ) + 

"fl3XiX3{xi - X3) + {I 3 \xi + 131 x 2 + Plx3 + Pl2^lX2 + PI3X1X3 + 

7 i 2 a;ia: 2 (a:i - X2) + 713X1X3(0:1 - X3)}zi + {/ 3 lxi + / 3 |x 2 + 

+ PI2X1X2 + PI3X1X3 + 712X1X2(xi - X2) + 

7 i 3 XiX 3 (Xi - X 3 )}Z 2 + {/?“xi + / 3 “x 2 + / 3 “x 3 + PllxiX2 + 

P\lxiX3 + 7i2XiX2(xi - X 2 ) + 7i3XiX3(xi - X3)}2:i (8) 

We consider a linear regression model for the mean fitted by Ordinary Least Squares 
(OLS) method . After the stepwise backward method we have found a model with 18 
terms given in Table [3] We observe that the obtained model was the same as that found 

by [H]. 


Table 3: Regression coefficients for the mean model by OLS method 


Terms 


Estimate 

Std. Error 

t value 

p-value 

x-i 


484.624 

6.363 

76.161 

0.0000 

X2 


474.875 

13.369 

35.521 

0.0000 



436.381 

64.837 

6.730 

0.0000 

XiXz 


468.313 

164.234 

2.851 

0.0057 

XlX2(Xl 

— X2) 

375.341 

94.623 

3.397 

0.0002 

X1X3(X1 

- ‘X3) 

-403.031 

199.679 

-2.018 

0.0473 

XlZi 


16.768 

5.452 

3.076 

0.0029 

X3Z1 


51.876 

8.406 

6.171 

0.0000 

X1X2(xi 

- 2 : 2 )^! 

-144.553 

60.706 

-2.381 

0.0199 

X1Z2 


54.933 

6.703 

8.195 

0.0000 

X2Z2 


42.504 

8.470 

5.018 

0.0000 

X1X3Z2 


188.762 

25.167 

7.500 

0.0000 

XiX3(xi 

- X 3 )Z 2 

-202.822 

61.681 

-3.288 

0.0016 

X2Z3 


-52.644 

14.972 

-3.516 

0.0008 

X3Z1 


164.077 

79.249 

2.070 

0.0420 

XiX3Z^ 


-600.046 

199.173 

-3.013 

0.0036 

X1X2(xi 

- x:2)z\ 

-440.721 

109.730 

-4.016 

0.0001 

XIX3(X1 

- X 3 )z 3 

525.480 

244.486 

2.149 

0.0349 


In our analysis, in order to make inference for both models of mean and dispersion, 
following the algorithm shown in Appendix B, we create a package implemented in the 
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R system for statistical computing m- However, other software could be used to obtain 
these models. 

For comparison between the competitors joint models, based on the ideas of [1], we 
consider a kind of quasi Akaike’s Information Criterion given by AICq = —2Q\ + 2{p+q), 
where p and q are the number of parameters in the models of the mean and dispersion, 
respectively. The best joint model is one that has the lowest value of AICq. A global 
measure of variation explained by the fitted joint model can be obtained by calculating 
the pseudo {Rp), defined as the square of the sample correlation coefficient between 
rj and (p{y). We can observe that 0 < Rp < 1 and the closer to 1 it is, the greater the 
correlation between the linear predictor and the transformed response observed. 

For the starting joint model (JMq), we consider as a plausible alternative to use the 
same linear predictor for both the mean model (Mg) and dispersion model (Dq), that 
is r]j^^ = , whose terms are given in Table [3] However, other alternatives could be 

considered for the choice of the linear predictors for the mean and dispersion models, 
see comments in [7]. The procedure for selecting the best model for the mean and 
dispersion came from an adaptation of the stepwise backward method, where the models 
for the mean and dispersion were changed alternately. The Table 2] shows the values of 
AICq and Rp for some joint models considered. In Tabled the linear predictor for the 
mean model Mj is the linear predictor for the mean model Mi removed the terms in 
braces, that is, — {terms}. The linear predictor for the dispersion model is 

obtained analogously. We note that when terms are removed from models of the mean 
and dispersion the values of AICq and Rp are get worse. The best joint model considered 
is JM 2 with = ? 7 „^, - {xizi, X 2 z‘{}, AICq = 809.8640 and R^ = 0.9199. 

In order to verify that the model JM 2 exhibits non-constant variance, we employ the 
studentized Breusch and Pagan test [T] which is available in R in the Imtest package m- 
The value provided by the test is 26.1733 with p-value = 0.03624, indicating, in fact, 
that the variance is not constant. 


Table 4: Values of AICq and R^ for comparison of joint models 


Joint model 

Linear predictor^ 


AICq 


JMo 

(terms given i 

(terms given i 

in Table 

n Table 

813.8589 

0.9199 

JMi 

^ ^Mq 

21} 

826.3424 

0.8995 

JM2 

’'Ms ^ ''Mq 

“ Hi 

Zl. 2:2^1} 

809.8640 

0.9199 

JM3 

’'M 3 = ’'Ml - C 
^^3 = 

13:2(3:1 - 3:2)21} 

21 , 3:10:2(3:1 — 3:2)21, 3:13:322, 3:13:3(3:1 - 3:3)22} 

819.3661 

0.8995 


~ ~ {terms} is the linear predictor removed the terms in braces. 

^ ~ {terms} is the linear predictor ^, removed the terms in braces. 
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Note that the extended quasi-likelihood is based on a saddlepoint approximation to an 
exponential family likelihood, i.e., the GLM family, as point out by [9], p. 81. Thus, 
we can build an approximate likelihood ratio test. Suppose we are considering two 
nested (extended quasi likelihood) models EQLl, with ni parameters, and EQL2, with 
n 2 parameters, such that EQLl C EQL2 and therefore n 2 > ni. Let —2Q\^ and 
— 2 Q '^2 be the adjusted extended quasi likelihood, given in equation (jd]), for the two 
models respectively. As well as InL Ri Q\, where L is the likelihood function, the 
approximate likelihood test is given by —2{Q\^ — < 5 ^ 2 ) Xns-ni- analysis, we 

consider the dispersion model as fixed and the test is built only for the terms in the mean 
model. Thus, we are considering EQLl as a full model with p + q parameters, that is, p 
parameters for the mean model and q parameters for the dispersion model. The EQL2 
model is the EQLl model with a term less in the mean model, withp-fg— 1 parameters. 
Thus, the approximate likelihood test is given by A = — Q\) ^ xfy where —2Q^ 

is the adjusted extended quasi likelihood for the full model with p + q parameters, —2Q\^ 
is the adjusted extended quasi likelihood for the full model removed the term x, with 
p + q — 1 parameters and Xi represents the chi-square distribution with one degree of 
freedom. 

For the dispersion model we use the analysis of deviance, because in this case we are 
assuming a true GLM. The analysis is conducted considering the model for mean as fixed, 
that is, for each iteration of the algorithm, shown in Appendix B, the analysis of deviance 
is performed after the mean model has been obtained. The statistic D^ — D‘^ ~ where 

D‘^ is the deviance for a dispersion model with q parameters, is the deviance for a 
dispersion model removed the term x with q — 1 parameters. The deviance for the 
dispersion model is given by where dd^ is the deviance component given in 

Table [T] 

Note that we could have used the approximate likelihood ratio test to analyse the terms 
of the dispersion model, but, as we are assuming the model for dispersion as known, 
we prefer to use the deviation analysis. Also note that the criterion of approximate 
information could be used in both mean and dispersion analysis, but this was not carried 
out. Finally, we consider the Wald’s method in both mean and dispersion analysis. 

The tests for significance of parameters in the models of the mean and dispersion in the 
joint model JM 2 are given in Tables [5] and [H respectively. 

As pointed out by [3] , for the residual analysis in the mean model we use the standardized 

deviance residual, given by r™ = sign{yi — and rf = sign{di — for the 

residual analysis in the dispersion model, with ^ and dj. = where 

hd- is the ith element of the diagonal of V^U(U^VU)~^UV^ (see Appendix A). The 
goodness of fit of the mean and dispersion models is assessed using different types of 
diagnostic displays shown in the Figures [2] and [3] respectively. 
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Table 5: Regression coefficients for the mean model in JMMD 


Terms 


Estimate 

Std. Error 

Wald Method 

EQD Method t 


t value 

p-value 

-20a. 

Chi-Sq. value 

p-value 

XI 


482.801 

6.225 

77.562 

0.0000 

1023806.141 

1023064.278 

0.0000 

^2 


470.863 

11.477 

41.026 

0.0000 

217854.278 

217112.415 

0.0000 

^3 


437.682 

21.961 

19.930 

0.0000 

1045693.586 

1044951.723 

0.0000 

X1X3 


488.284 

45.242 

10.793 

0.0000 

260668.354 

259926.491 

0.0000 

xiX2(xi 

- ^ 2 ) 

247.959 

90.910 

2.728 

0.0080 

1706.425 

964.562 

0.0000 

xiX 3 (xi 

- 2 : 3 ) 

-302.267 

87.106 

3.470 

0.0009 

2820.485 

2078.622 

0.0000 

xizi 


14.276 

4.965 

2.875 

0.0053 

757.165 

15.302 

0.0001 

X 3 Z 1 


52.470 

6.782 

7.737 

0.0000 

839.941 

98.078 

0.0000 

xiX2(xi 

- 2 : 2)21 

-138.624 

47.359 

2.927 

0.0046 

761.398 

19.535 

0.0000 

xiZ2 


57.738 

5.930 

9.736 

0.0000 

14109.226 

13367.363 

0.0000 

X2Z2 


52.242 

4.648 

11.240 

0.0000 

3290.771 

2548.908 

0.0000 

xix^Z2 


154.184 

12.388 

12.447 

0.0000 

24544.717 

23802.854 

0.0000 

xiX 3 {xi 

- 23)22 

-281.902 

18.707 

15.069 

0.0000 

2396.114 

1654.251 

0.0000 

3:221 


-42.406 

13.145 

3.226 

0.0019 

833.364 

91.501 

0.0000 

3 : 32 ^ 


143.488 

51.188 

2.803 

0.0065 

1456.674 

714.811 

0.0000 

XlX^zf 


-565.182 

130.336 

4.336 

0.0000 

1750.571 

1008.708 

0.0000 

xiX2ixi 

- 22)21 

-330.179 

102.329 

3.227 

0.0019 

847.981 

106.118 

0.0000 

xixz{x-i 

- 23 ) 2 ^ 

362.238 

168.999 

2.321 

0.0231 

814.162 

72.299 

0.0000 


^ is the value of the EQD for a joint model with fixed dispersion model and with the mean model 

removed the term x, with rip — 1 — 17 parameters. —2Q'^ — 741.863 with rip — 18 parameters. 


Table 6: Regression coefficients for the dispersion model in JMMD 


Terms 


Estimate 


Wald Method 

Deviance Method ^ 


Std. Error 

t value 

p-value 


Chi-Sq. value 

p-value 

XI 


6.028 

0.498 

12.096 

0.0000 

12047.685 

11892.898 

0.0000 

^2 


5.221 

0.708 

7.372 

0.0000 

1248.478 

1093.691 

0.0000 

^3 


18.488 

5.799 

3.188 

0.0021 

126.55e5 

126.55e5 

0.0000 

X1X3 


-47.758 

15.743 

-3.034 

0.0033 

856.731 

701.944 

0.0000 

xiX2{xi 

- ^ 2 ) 

25.977 

7.414 

3.504 

0.0008 

324.438 

169.651 

0.0000 

xix^{xi 

- ^ 3 ) 

52.270 

18.295 

2.857 

0.0056 

2153.342 

1998.555 

0.0000 

X3Z1 


-0.290 

0.567 

-0.511 

0.6111 

155.407 

0.620 

0.4310 

xiX 2 {xi 

- ^ 2)^1 

-5.249 

4.775 

-1.099 

0.2752 

159.466 

4.679 

0.0305 

xiZ2 


-0.456 

0.537 

-0.849 

0.3986 

160.094 

5.307 

0.0212 

X2Z2 


1.846 

0.706 

2.615 

0.0108 

185.159 

30.372 

0.0000 

xix^Z2 


-1.592 

2.078 

-0.766 

0.4461 

157.511 

2.724 

0.0989 

xix^{xi 

- ^ 3)^2 

-6.438 

5.222 

-1.233 

0.2216 

158.718 

3.931 

0.0474 



-15.316 

6.456 

-2.372 

0.0203 

541.654 

386.866 

0.0000 

XlX^zf 


56.251 

17.325 

3.247 

0.0018 

182.66e5 

182.66e5 

0.0000 

xiX 2 {xi 

- ^2)^1 

-32.718 

8.537 

-3.833 

0.0003 

435.597 

280.810 

0.0000 

Xix^(xi 

- ^3)^1 

-51.608 

20.485 

-2.519 

0.0139 

1396.891 

1242.104 

0.0000 


^ is the value of the deviance for the dispersion model removed the term x, with rig — 1 = 17 parameters 
— 154.787, with riq — 18 parameters. In both and the mean model is the same. 
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KigUrO 2) Six diagnostic plots for data of bread mixture - mean model. The upper left panel plots 
standardized deviance residuals against the order of observations, the upper right panel plots the Cook’s 
distances versus the order of observations, the middle left panel displays the diagonal elements of the matrix 
hat against the predicted values, the middle right panel plots the standardized deviance residuals versus the 
linear predictor, the lower left panel displays the half-normal plot of absolute standardized deviance residuals 
with a simulated envelope and the lower right panel presents the predicted values versus observed values. 


The final model for mean is given by 


E{Y\Zi, Z2) = 482.80a;i + 470.860:2 + 437.68x3 + 488.28x1x3 + 247.96xiX2(xi — X2) — 

302.27xiX3(xi — X 3) + [14.28xi + 52.47x3 — 138.62xiX2(xi — X2)]2i + 
[57.74x1 + 52.24x2 + 154.18x1X3 — 281.90xiX3(xi — X3)]2;2 + 

[—42.41x2 + 143.49x3 — 565. 18 x 1 X 3 — 330.18xiX2(xi — X 2 ) + 
392.24xiX3(xi — X3)]2i (9) 

and the model for dispersion is the same model for the variance, i.e., Var{Y\Zi, Z 2 ) 
= (p, because V{fi) = 1, and it is given by 
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Figure 3l Four diagnostic plots for data of bread mixture - dispersion model. The upper left panel plots 
standardized deviance residuals against the linear predictor, the upper right panel plots the absolute standard¬ 
ized deviance residuals versus the linear predictor, the lower left panel displays the normal plot of residuals 
with a simulated envelope and the lower right panel presents the histogram of standardized deviance residuals. 


Var(YIZi,Z 2 ) = ea;p{6.03a;i + 5.22a:2 + 18.49a;3 — 47.76a;ia;3 + 

25.98a;ia;2(2;i — * 2 ) + 52.27a:i2;3(2;i — 2 : 3 ) — 5.25a;i2;2(2;i — 

X 2 )zi + [—0.462:1 + 1 . 842 ; 2 ] z 2 + [—15.322:3 + 56.252:12:3 — 

32.722:12:2(2:1 — X2) — 51.612:12:3(2:1 — X3)]zf} (10) 


To obtain the expressions for E{Y) and Var{Y), we supposed that the coded noise 
variables Zi ~ Zi ~ N{^ 2 ,cf 2 ) ^^e independent random variables. Now, 
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knowing that E{Y) = E{E(Y\Zi, Z 2 )) we have that 

E{Y) = 482.80*1 + 470.862:2 + 437.68a;3 + 488.280:10:3 + 247.96a:ia;2(3;i — *2) — 
302.270:10:3(0:1 — 0:3) + [14.280:1 + 52.470:3 — 138.620:10:2(0:1 — X2)]fJ,l + 

[ 57 . 74 o:i + 52 . 24 o :2 + 154 .180:10:3 — 281 .960:10:3(0:1 — X3)]fj.2 + [— 42 . 41 o :2 + 
143.490:3 — 565.180:10:3 — 330.180:10:2(0:1 — 0:2) + 392.240:1*3(0:1 — *3)](^ti + 

(11) 

where E{Zi) Using the fact that yar(F) = E{Var{Y\Zi, Z 2 ))+Var{E{Y\Zi, 

Z 2 )) and that if Z ^ then E{e°‘^^) = = (l/v^) 

e2" y j with ^2 > 0, and Var{aiZ + = o’^(af + 2Qf|(2/x^ + cr^) + 4aiQ:2/o), 

where ki = aia"^ + ^, ^2 = 1 — 2 cr^Q; 2 , and ai, 0:2 are constant, we have that 


Var{Y) 


£;(gmigm2Zigm3Z2gm4Z?) Var{m 5 Zi + mgZa + rmZf) = 


m^cT^ 

^mi^m3fi.2A - 1 - 2 - 


■\/l — 2(Tim4 

o'i(™5 + 27717(2/0^ + af ) + 477157777/01), 


/ (Tn2cr^+^i)^ 
I 1 — 2ct^ 7714 




+ mlal + 


( 12 ) 


where rrii = 6.03*1 + 5.22*2 +18.49*3 — 47.76*1*3 + 25.98*1*2(*i — 2 : 2 ) + 52.27*i*3(*i — 
2 ^ 3)1 n 72 = — 5 . 25 *i* 2 (*i — * 2 ), 7773 = —0.46*1 + 1.84*2, 7774 = —15.32*3 + 56.25*1*3 — 
32 . 72 *i* 2 (*i — * 2 ) — 51 . 61 *i* 3 (*i — * 3 ), 7775 = 52.47*3, me = 57.74*i + 52.24*2 + 
154.18*1*3 — 281 . 90 *i* 3 (*i — * 3 ), 7777 = —42.41*2 +143.49*3 — 565.18*1*3 — 330.18*1*2 
(*i — * 2 ) + 392 . 24 *i* 3 (*i — * 3 ). 


5.2 Modeling of mean and variance via delta method 

The use of the delta method to obtain models for the mean and variance in mixture 
experiments with process variables was introduced by [6]. Goldfarb et. al [6] consider 
the model for the response variable Y given by F = t ]{ x , w , z ) + e, where x is the vector 
of the mixture components, w is the vector of the controllable process variables, 2 ; is 
the vector of the noise variables, e is the random error with e ~ N{0, a^) and r]{x, w, z) 
is the linear predictor, containing quadratic or special cubic mixture terms, interactions 
between the mixture components and the controllable process variables, interactions 
between mixture components and the noise variables, and interactions among all three 
types of variables. Considering the noise variables Z* = (Zi,..., Zf,), they assume that 
E{Z) = 0 and Var{Z) = Diag{a \,..., a^) is a b x b diagonal matrix with the variances 
of the noise variables on the diagonal. Goldfarb et al. [B] obtain E{Y) and Var{Y) via 
delta method considering a first-order Taylor series approximation of the model around 
the mean of Z, taken as zero. 
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5.2.1 Model building strategy 


As proposed in Section |4l the delta method will be applied considering the second-order 
expansion of Taylor series around the mean of noise variables. For we apply the delta 
method we consider the final model obtained by m whose terms are shown in Table 
[31 Thus, the delta method is applied to the estimated model, given by y = ri{x,z) = 
Cl -f C 2 Z 1 + C 3 Z 2 + C 4 Z 1 , where ci = 484.62a;i -I- 474. 883:2 + 436.38x3 + 468.31xiX3 -f 
375, 34 xiX 2 (xi — X 2 ) — 403.03xiX3(xi —X 3 ), C 2 = 16.77xi-1-51.88x3 —144.55xiX2(xi—X 2 ), 
C 3 = 54. 93 x 1 - 1 - 42 . 50 x 2 - 1 - 188 . 76 x 1 X 3 —202.82xiX3(xi—X 3) and C 4 = -52.64x2-1-164.08x3— 
600.05x1X3 — 440.72 xiX2(xi — X 2 ) -|- 525.48xiX3(xi — X 3 ) represent the constant terms in 
relation to 2 :. 


The response model assumed by 


is y = r]{x,z) + e with e ^ A^(0, cr^), where 

^2 


z^ = (zi, 2 ; 2 )- Thus, considering E{Z) = /x* = (yi,/i 2 ) and Var{Z) = Diag{a‘l,a 2 ), 


from Section 0] we have that rj' {x,fi) = —r](x,z) 

az 


Z=IJ, 



and EL = 




dzdz* 


V{x,z) 


Z=fJ, 


2c4 0 

0 0 


Now, from the equation ([5]) we obtain y = r](x, z) + e = ci+ C 2 yi + C 3 /i 2 + C 4 /xf -I- (21 — 
Mi)(c 2 + 2 c 4 /xi) -|- € 3(22 ~ M 2 ) + 204(21 — Mi)^ + TZ + £. 


Thereafter, using the fact that if Z ^ N{g,a'^) then E{Z^) = +3cr^fj,, Var{Z'^) = 

4a^fi^ + 2a* and Var{aZ+bZ^) = a^Var{Z) + b^Var{Z^)-2ab[E{Z^)-E{Z)E{Z^)], the 
models for mean and variance can be obtained from equations m and 0 , respectively, 
and are given by 


E{Y) = Cl -b C 2 M 1 + C 3 M 2 + £ 4^1 -b 2 c4(Ti 


(13) 


and 


Var{Y) = [cl + 4 c 2 C 4 /ii -b a\ + Sc^af -b Cgcr^ + (14) 

5.3 Optimization process 

Following the Taguchi’s idea for the quality improvement, see [H], after we found the 
equations for E{Y) and Var{Y), we have to solve the following minimization problem 
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Min VariY) 


Subject to 


E(Y) = 530.0 
Xi + X2 + Xz = 1.0 
< 0.25 < a;i < 1.0 
0.0 < X2 < 0.75 
0.0 < X3 < 0.75 


(15) 


where E{Y) and Var{Y) are functions of xi, X 2 , x^, fii, ^ 2 , and (t|. Fort the 
JMMD the expressions for E{Y) and Var{Y) are given by the equations (fTTl) and (fl^ . 
respectively, and we still must assume an additional constraint, i.e., 1 — 2o'^m4 > 0. For 
the delta method the expressions for E{Y) and Var{Y) are given in equations (ITSl) and 
o, respectively, and we use = 58.36 as an estimative for in equation 

d, where D = 4201,615 is the deviance for the model shown in Table O with n = 90 
observations and p = 18 parameters. 

We solve the optimization problem, considering various scenarios involving the values of 
mean and variance for the random variables mixing time and proofing time. The Table [7] 
shows, for the JMMD and the delta method, the optimum combination for the mixture 
and its variance estimated for each scenario involving the noise variables. 


Table 7: Optimal values of mixture for various scenarios involving the noise variables, 
where the models for mean and variance were obtained via JMMD and delta method. 


Mix. Time ^ 

Proof. Time ^ 

JMMD 


Delta Method 


Optimum 

{X 1 ,X 2 ,X 3 ) 

Variance 

estimated 

Optimum 

(Xl, X2,X3) 

Variance 

estimated 

(10.0,6.25) 

(47.50, 9.766) 

(0.303, 0.483, 0.214) 

160.916 

(0.250, 0.063, 0.687) 

691.387 

(12.5,6.25) 

(44.375,9.766) 

(0.310, 0.534. 0.156) 

98.893 

(0.250, 0.066, 0.684) 

548.857 

(15.0,6.25) 

(41.25,9.766) 

(0.306, 0.568, 0.126) 

66.912 

(0.250, 0.037, 0.713) 

473.602 

(15.0,25.0) 

(47.500,39.063) 

(0.300, 0.560, 0.140) 

272.526 

(0.250, 0.168, 0.582) 

1569.286 

(15.0,56.25) 

(47.50,87.891) 

(0.250, 0.541, 0.209) 

1061.528 

(0.250, 0.005, 0.745) 

7406.725 

(20.0,6.25) 

(53.75, 9.766) 

(0.298, 0.598, 0.104) 

215.019 

(0.250, 0.423, 0.327) 

217.547 

(20.0,25.0) 

(53.75,39.0625) 

(0.286, 0.599, 0.115) 

432.377 

(0.250, 0.393, 0.357) 

794.810 

(20.0,56.25) 

(53.75,87.891) 

(0.439, 0.498, 0.063) 

1497.716 

(0.250, 0.326, 0.424) 

2421.067 


+ Q 

Mixing time is a normal random variable with mean (Xm and variance cr^. 

t o 

■‘’Proofing time is a normal random variable with mean fip and variance < 7 ^. 


From the scenarios considered in Table [71 we can observe that, for the delta method, 
the proportion for xi is not affected by changes in the parameters of the noise variables 
and that, for all scenarios considered, the variance estimated using the delta method is 
greater than the variance estimated by the JMMD. Note also that, given the distributions 
for the random variables mixing time and proofing time, the probability distributions 
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for the variables Z\ and are obtained by Z\ ~ Zi ^ 7V(^2,o'|), where 

Ml = (Mm — 15)/10, af = cr^/100, M2 = (Mp ~ 47.5)/12.5 and ct| = crp/156, 25. 

6 Discussion 

In this paper, we have applied the delta method and the joint modeling of mean and 
dispersion in a mixture problem with noise variables, with the goal of finding an optimal 
combination of the mixture ingredients, in order to make the mean of the response 
variable robust to the noise conditions. 

In bread-making example, that had as objective to investigate and to value the final 
quality of flour composed by different mixtures of wheat flour for production of bread, 
the optimal combination for the mixture of flour should be obtained to be robust to the 
mixing and proofing times of the dough, considered as noise variables. The results shown 
in Section [S] give a comprehensive treatment for the problem in each of the approaches 
considered. 

In our analysis we have considered the noise variables as random variables with Gaussian 
distribution and we considered various scenarios involving the noise variables, shown in 
Table [T] We have obtained optimal combinations of mixture that were robust to the 
noise conditions, i.e., for the mixtures obtained is expected that the bread produced will 
have a mean volume of 530 ml, independent of the noise situations for which the mixture 
was exposed. From the scenarios considered in Table [71 we can also observe that the 
variance estimated using the delta method is greater than the variance estimated by the 
JMMD. However, we can not compare the two approaches considered because there is 
no method for such a comparison. 

In the optimization process, shown in Subsection 15.31 the variables Zi and Z 2 were 
considered noise variables, however, in some situations, some process variables can be 
considered controllable variables, this does not alter the procedures shown in Subsec¬ 
tions 15.1.11 and 15.2.11 nevertheless for the optimization process given in Subsection lOl 
such variables are not considered random variables and the optimization process can be 
conducted for fixed values of these variables. 

For the bread-making example, we have considered the Gaussian distribution for the 
noise variables, however other distributions could be considered, for example, the distri¬ 
butions Gamma or lognormal, but the procedure for finding E{Y) and Var{Y) would 
be harder. We also have considered independence between noise variables, but for situ¬ 
ations where this assumption can not be considered, the complexity of the optimization 
process increases. Dependence on noise variables using the delta method with first order 
approximation is considered by [6] and for the JMMD still needs further study. 

The mean model obtained by m was used as base to apply the delta method. The 
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delta method was applied considering the second-order expansion of Taylor series around 
the mean of noise variables, however, we could also have used a less accurate first-order 
approximation. It is worth mentioning that in the case of the delta method, there is only 
an approximation to the variance. That is, there is no statistical model associated with 
the variance, just as there is in the case of JMMD. 

One can not make formal comparisons between our analysis, using the JMMD, and that 
proposed by m because they are different approaches and there is no a measure that 
allows comparisons, however we can use the values of Rp and the graphs of the observed 
values versus the predicted values by the models considered as a criterion for comparison. 
For the model proposed by |12) . whose terms in the linear predictor are shown in Table 
[31 = 0.9189 and for our model, given in equation (jH]), Rp = 0.9199. The Figure 

0] shows the graphs of the observed values versus the predicted values by both models 
considered. It should also be mentioned that as much using the delta method as JMMD, 
we get a model for the variance and it is possible to construct an optimization problem 
which allows to obtain optimum values for the ingredients of the mixture making the 
mean response robust to noise factors. 

It is important to emphasize that when we use JMMD other distributions, that not 
only the Gaussian distribution, could be considered for the mean model, for instance, 
distributions for counts or proportions. However, for example, in case of a model for the 
mean is of Poisson type, with V{^) = fi, the complexity of the optimization problem 
increases, since Var{Y\Z) = = /rexp{g‘(M) 7 }. 


Predicted versus Observed Values 



Figur6 4! Comparison between the two models considered for the mean of response. Filled balls represent 
the values predicted by our model, given in equation lU, using the JMMD. Empty balls represent the values 
predicted by the model obtained by 1121 . whose terms in the linear predictor are shown in Table 
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Appendix A - Iterative process for JMMD 

Adapted from |15| . 


Mean Model 


Let j/i, • • • , yn be independent observations, resulting from n independent random vari¬ 
ables Yi, - ■ ■ ,Yn', ti, ■ ■ ■ ,tp are the p explanatory variables that affect the mean model and 
/3i, • • • , Pp are the unknown parameters of the model. Consider that pd = (/ii, • • • , y„), 
(f>* = and suppose that ElYpZi) = pi and VariYpZi) = 4>iV{p.i) are 

known. We start putting fc = 1, /3 q = (0, • • • , 0), /Xq = (?/i, • • • , y„) and = (1, • • • ,1). 
Now, by using the iterative weighted least squares method, we obtain the p x 1 vector 

where the matrix n x p, T, is the design matrix 
for the mean model; the matrix n x n, = Diag(w(j-i)i, - ■ ■ , ry(j_i)„), is the 

weight matrix for the GLM, with Diag{.) representing a diagonal matrix with the ele¬ 
ments in^he diagonal, and = (r(j_i)i, • • • , r(,_i)„) 

is a n X 1 vector, with -I- - A<(j-i)i) for * = I,-- - ,n and 

j = 1, 2, • • •. In each iteration j (j = 1, 2, • • •) a new is obtained and the process 
continues until a convergence criterion is fulfilled. A possible convergence criterion could 
be ||/3(j) - /3(j_ . 2 ) IP < S, where || || represents the norm of a vector and (5 G K. Af¬ 
ter the convergence is reached, do TVfc = TV(j_i), store the last as /3j, and use it 
to compute the vector n x 1, /x^,, i.e., /Xj, = here p is an invertible known 

function. With the estimated value /x^, compute the vector n x 1, whose elements 
are where dki = and hki is the xth element of the diagonal of 

Hk = WlT{T^WkT)~^TWl. Now, knowing the d*^ vector, we can adjust the dis¬ 
persion model considering the weights for each value for i = 1,..., n. 


Dispersion model 

Given ,dfe„) and the weights (1 — hki,--- ,1 — hkn), let ui,--- ,Uq be 

the q explanatory variables that affect the dispersion model, 7 *p^ = (O,--- ,0), <f>Q = 
[dll,-- - ,d^„) and - , 7 g the unknown parameters of the model. Gonsidering a 
Gamma distribution for the dispersion model and using the iterative weighted least 
squares method, we obtain the g x 1 vector, 7 ^) = ([/*V(j_i)[/) 

where U is the nxq design matrix for the dispersion model; V(j-i) = • • • , U(j_i)„) 

is the nx n matrix of weights for the GLM, with ^ (f 

are the elements in the diagonal; and • • • , is an n x 1 vector 

with for z = 1, • • • , n and j = 1, • • •. In the 
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same way as it was done for the mean model, in each iteration j (j = 1 , 2 , • • •) a new 
is obtained and the process continues until a convergence criterion is fulfilled. After 
the convergence is reached, store the last as 7 ^, and use it to compute the n x 1 
vector cj))., that is, (pf. = {UfP), where ijj is an invertible known function. For the 
dispersion model, tp is generally taken as the logarithmic function, so pk = exp([ 77 ;.). 
Now, with the estimated vector cj)\ = {pki,"' ^Pkn), return to the mean model and 
again use the iterative weighted least squares, but now with the new weights Thus, 
in the mean model, for each j — 1 , the elements of diagonal matrix will be given by 


- (dva-Z) 


‘Pk-vik-c 1 )-) ' fitting alternates between the mean and disper¬ 
sion models until convergence is achieved. For example, a convergence criterion could 
be 


I ‘^QAk ( 


I-2Q+ 


< e, fc = 1 , 2 ,.. 


Ak\ 


(16) 


where is the extended quasi-deviance adjusted, obtained in the kth cycle, e £ K 

and I I represents the absolute value operator. At the beginning of the process, consider 
-2Q+, = 0. 

Note that in the kth. cycle we must check if the mean and dispersion models are well 
adjusted and if some parameter in the mean or dispersion model should be excluded of 
the joint model. For more details about model checking see [U and 0 . 


Appendix A - Algorithm for JMMD 

Adapted from |15| . 

Consider the assumptions and definitions given in Appendix A. The iterative method for 
the JMMD can be resumed in the following algorithm. 

Start 
Set k = 1 

Do /3* = (0, ••• ,0), = {yi,--- ,yn) and (pl = {1, ■ ■ ■ ,1) 

While the convergence is not achieved (joint model) 

Set j = 1 

While the convergence is not achieved (mean model) 

Compute /3(j) = (r‘W(j_i)T) ^ T*W(j_i)r(j_i) 

If 11%)-%- < S, stop (achieved convergence) 

Do f3f, = Wk = VF(j_i); = ip-'^(Tf3k) 

Else 
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Do j = j + 1 

End if 
End while 
Compute dl 

Do 7^ = (0,--- = ,dlJ 

Set j = 1 

While the convergence is not achieved (dispersion model) 
Compute 7 (j) = {UWq_i)U) ^ t/*V' q_i)Sq_i) 

If ||7(j) — 7(j_i)|P < S, stop (achieved convergence) 

Do lk = l(j)] Vfe = V(j_i); (f>^ = tp-^{U'yk) 

Else 

Do j = j + 1 

End if 
End while 
Do -2Qio = 0 

If ^ < e, stop (achieved convergence) 

Else 

Do k = k + 1 

End if 
End while 

Do/3 = /3fc; 7=7fc 

End 


24 



